Information escaping the correlation hierarchy of the convergence field in the study of cosmological parameters
Using fits to numerical simulations, we show that the entire hierarchy of moments quickly ceases to
provide a complete description of the convergence one-point probability density function when leaving
the linear regime. This suggests that the full N-point correlation function hierarchy of the convergence
field quickly becomes generically incomplete, and a very poor cosmological probe, on non-linear scales.
At the scale of unit variance, only 5% of the Fisher information content of the one-point probability
density function is still contained in its hierarchy of moments, making clear that information escaping
the hierarchy is a far stronger effect than information propagating to higher order moments. It
follows that the constraints on cosmological parameters achievable through extraction of the entire
hierarchy become suboptimal by large amounts. A simple logarithmic mapping makes the moment
hierarchy well suited again for parameter extraction.
Introduction. N-point correlation functions, first introduced in cosmology by Peebles and
collaborators to describe the large scale distribution of galaxies [1], are now ubiquitous in
this field. They are at the heart of many cosmological probes such as the CMB, galaxy clustering,
or notably weak lensing, which was recognized as one of the most promising probes of the dark
components of the universe [2-4], and which traces the cosmological convergence field.

On large scales, or in the linear regime, correlations are a particularly convenient approach to
the difficult problem of statistical inference on cosmological parameters. Indeed, primordial
cosmological fluctuation fields are believed to obey Gaussian statistics, and the first two members
of the hierarchy, the mean and the two-point correlation function, provide a complete description
of such fields. However, much less is known about the pertinence of the correlation hierarchy in
the non-linear regime, or on small scales, where in principle a lot of information is contained, if
only due to the large number of modes available for the analysis. More elaborate statistical models
must be built in this regime. For instance, the statistics of the matter field and of its weighted
projection, the convergence field, were shown to be closer to lognormal, at least in low
dimensional settings [?], though with sizeable deviations still.
Two effects relevant for statistical inference can in principle play a role when entering the
non-linear regime and departing from Gaussian initial conditions. First, information may propagate
to higher order correlators. Second, the correlation function hierarchy may no longer provide a
complete description of the field, so that information escapes the hierarchy. Even though this
second possibility was already pointed out qualitatively in an astrophysical context in [?], it
seems that it was not given further attention in the literature.

In this Letter we show, using accurate fits of the convergence one-point probability density
function to numerical simulations [9], that the second effect very quickly and completely dominates
for the convergence field, and thus that the hierarchy is no longer well suited for inference on
cosmological parameters.

Fisher information and orthogonal polynomials. The approach is based on decomposing the Fisher
matrix-valued information measure into components unambiguously associated with the independent
information content of the correlations of a given order. It was recently proposed in [10],
building upon [11]. Exact results at all orders were obtained only for the moment hierarchy of an
idealized, perfectly lognormal one-dimensional variable, where analytical methods could be applied.
In cosmology, the Fisher information matrix has been widely used for many years to estimate the
accuracy with which cosmological parameters will be extracted from future experiments aimed at some
observables [3, ?, e.g.], assuming Gaussian statistics.

For a general probability density function $p(x, \alpha, \beta, \cdots)$, with $\alpha, \beta,
\cdots$ any model parameters, its definition is
$$ F_{\alpha\beta} = \left\langle \frac{\partial \ln p}{\partial \alpha} \frac{\partial \ln p}{\partial \beta} \right\rangle. \qquad (1) $$
Its inverse can be seen through the Cramér-Rao bound [12] to be the best covariance matrix of the
relevant parameters achievable with the help of unbiased estimators. The general procedure to
decompose the Fisher information content into uncorrelated pieces, corresponding to an orthogonal
system, was presented in a statistical journal in [11]. When the observables of interest are
products of the variables, i.e. moments or more generally correlation functions, the orthogonal
system consists of orthogonal polynomials. It is discussed in detail in a cosmological context in
[10]. In particular, the variables for which the Fisher information content on $\alpha$ is entirely
within the first N pieces, such as the Gaussian variables for N = 2, are those for which the
function $\partial_\alpha \ln p$ entering (1), called the score function, is a polynomial of order
N in x. In the case of a single variable, the uncorrelated contribution of order N to the Fisher
information matrix $F_{\alpha\beta}$ is given by



$$ s_N(\alpha)\, s_N(\beta), \qquad (2) $$



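where the Fisher information coefficients $s_N$ are the components of the score function with
respect to the orthonormal polynomial of order N,
$$ s_N(\alpha) = \left\langle \frac{\partial \ln p}{\partial \alpha} P_N(x) \right\rangle, \qquad (3) $$
$$ \left\langle P_n(x) P_m(x) \right\rangle = \delta_{mn}, \quad n, m \geq 0. \qquad (4) $$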
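As an illustration of the Gaussian case mentioned above, the coefficients can be evaluated numerically for a unit Gaussian variable, whose orthonormal polynomials are the probabilists' Hermite polynomials divided by $\sqrt{n!}$. The following is only a minimal sketch under that assumption; the function name `gaussian_fisher_coefficients` and the choice of the variance as the parameter are ours, not the paper's.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

def gaussian_fisher_coefficients(n_max=4, n_nodes=40):
    """Coefficients s_n of the score of a unit Gaussian w.r.t. its variance."""
    # Gauss-Hermite nodes/weights for the weight exp(-x^2 / 2); dividing by
    # sqrt(2 pi) turns weighted sums into averages over the unit Gaussian.
    x, w = hermegauss(n_nodes)
    w = w / math.sqrt(2.0 * math.pi)
    # Score with respect to the variance, evaluated at mean 0 and variance 1.
    score = 0.5 * (x**2 - 1.0)
    coeffs = []
    for n in range(n_max + 1):
        c = np.zeros(n + 1)
        c[n] = 1.0
        # Orthonormal polynomial P_n = He_n / sqrt(n!).
        p_n = hermeval(x, c) / math.sqrt(math.factorial(n))
        coeffs.append(float(np.sum(w * score * p_n)))
    return np.array(coeffs)

s = gaussian_fisher_coefficients()
fisher_total = 0.5  # exact Fisher information on the variance, 1 / (2 sigma^4)
# Only the n = 2 entry is non-zero (up to round-off): the score is a
# polynomial of order 2, so the first two moments carry all the information.
print(s**2 / fisher_total)
```

The same projection applies to any distribution once its orthonormal polynomials are known; only the Gaussian case has them in closed form this simply.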



For any N, the following relation holds
$$ \sum_{n=1}^{N} s_n(\alpha)\, s_n(\beta) = \sum_{i,j=1}^{N} \frac{\partial m_i}{\partial \alpha} \left[ \Sigma^{-1} \right]_{ij} \frac{\partial m_j}{\partial \beta}, \qquad (5) $$
where $m_i = \langle x^i \rangle$ and $\Sigma_{ij} = m_{i+j} - m_i m_j$ is the covariance matrix.
The right hand side is the expression describing the Fisher information content of the moments
$m_1$ to $m_N$. Whether one recovers the full matrix $F_{\alpha\beta}$ for $N \rightarrow \infty$,
or only parts of it, depends on the distribution under consideration. A sufficient condition is
that the polynomials $P_n$ form a complete basis set, which is then essentially equivalent to the
condition that the distribution can be uniquely recovered from its moment hierarchy [?, and
references therein]. This and other sufficient criteria for completeness are tightly linked to the
decay rate of the probability density function at infinity.
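The Fisher information content of the first N moments can be evaluated directly from the moments and their covariance. The sketch below does this for a perfectly lognormal variable with unit mean, used here as a stand-in and not the convergence model of the text. It relies on the standard lognormal moments $\langle x^n \rangle = (1 + \sigma^2)^{n(n-1)/2}$ and compares the result with the total lognormal Fisher information quoted later in the text; all function names are illustrative.

```python
import numpy as np

def lognormal_moment(n, var):
    """<x^n> for a lognormal variable with unit mean and variance `var`."""
    return (1.0 + var) ** (n * (n - 1) / 2.0)

def lognormal_moment_derivative(n, var):
    """d<x^n>/d var, from differentiating the closed form above."""
    k = n * (n - 1) / 2.0
    return k * (1.0 + var) ** (k - 1.0)

def moment_fisher(var, n_max):
    """Fisher information on `var` carried by the moments m_1 .. m_{n_max}."""
    orders = range(1, n_max + 1)
    dm = np.array([lognormal_moment_derivative(i, var) for i in orders])
    # Covariance of the monomials x^i and x^j: m_{i+j} - m_i m_j.
    cov = np.array([[lognormal_moment(i + j, var)
                     - lognormal_moment(i, var) * lognormal_moment(j, var)
                     for j in orders] for i in orders])
    return float(dm @ np.linalg.solve(cov, dm))

def lognormal_total_fisher(var):
    """Total Fisher information on `var` of the lognormal variable."""
    q = 1.0 / (1.0 + var)
    return (q / np.log(q)) ** 2 / 2.0 - q**2 / (4.0 * np.log(q))

for var in (0.01, 1.0):
    ratio = moment_fisher(var, 2) / lognormal_total_fisher(var)
    print(var, ratio)  # fraction of the information in the first two moments
```

The moments grow extremely fast with their order, so the covariance matrix becomes ill-conditioned quickly; this direct evaluation is only meant for small N.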

We define the cumulative efficiency $\epsilon_N$ of the moments up to order N to capture Fisher
information on $\alpha$ as
$$ \epsilon_N(\alpha) := \frac{1}{F_{\alpha\alpha}} \sum_{n=1}^{N} s_n^2(\alpha). \qquad (6) $$
From the Cramér-Rao bound, $\sqrt{\epsilon_N}$ is the ratio of the best constraints achievable on
$\alpha$ with any unbiased estimator to the expected constraints on $\alpha$ from the extraction of
the first N moments.

FIG. 1. The three parameters $Z$, $A$ and $\omega^2$ entering the generalized lognormal model, as a
function of the variance of $\delta_{\rm eff}$.

Fisher information coefficients. We use the fits to simulations from [9], valid down to arcsecond
scales. Initially built to correct for the failure of the lognormal distribution to reproduce the
high and low density tails of the convergence $\kappa$ on a single lens plane, the model accurately
reproduces the cosmological convergence as well, taking into account the broader lensing kernel
[13]. In terms of the reduced variable x,
$$ x := 1 + \frac{\kappa}{|\kappa_{\rm empty}|}, \qquad (7) $$



where $\kappa_{\rm empty}$ is the minimal value of the convergence, corresponding to a light ray
traveling through an empty region, it takes the form of a generalized lognormal model for the
associated effective matter fluctuations



$$ p(x, \sigma) = \frac{Z}{x} \exp\left[ -\frac{1}{2\omega^2} \left( \ln x + \frac{\omega^2}{2} \right)^2 \left( 1 + \frac{A}{x} \right) \right]. \qquad (8) $$



In this equation, the three parameters $Z$, $A$ and $\omega$ are such that the mean of x is unity,
and its variance is $\sigma_\kappa^2/\kappa_{\rm empty}^2$ (we are neglecting here a small but
non-zero mean of the convergence argued in [13]). Therefore, the only relevant parameter is the
variance of the associated matter fluctuations $\sigma^2$, fixed by the cosmology from
$\kappa_{\rm empty}$ and the convergence power spectrum, together with some filter function
corresponding to the smoothing scale, and determining the level of non-linearity of the field
[9, figure 1]. The linear and non-linear regimes are separated at $\sigma^2 \approx 1$. We obtained
$Z$, $A$ and $\omega^2$, shown in figure 1, with the help of a standard implementation of the
Newton-Raphson method for non-linear systems of equations.

FIG. 2. The cumulative efficiency $\epsilon_N$ of the moments of the convergence in capturing
Fisher information, for N = 2 to N = 5, defined in eq. (6), from bottom to top, as a function of
the variance of $\delta_{\rm eff}$.

Orthogonal polynomials can very conveniently be generated by recursion, as explained in detail in
[14], since they satisfy a three-term recurrence formula. We define for convenience
$$ \tilde\pi_N(x) := \sqrt{p(x, \sigma)}\, \pi_N(x), \qquad (9) $$
where $\pi_N(x)$ is $P_N(x)$ rescaled such that the coefficient of $x^N$ is unity. The recursion
relations in [14] become
$$ \tilde\pi_{k+1}(x) = (x - \alpha_k)\, \tilde\pi_k(x) - \beta_k\, \tilde\pi_{k-1}(x), $$
$$ \alpha_k := \frac{\int_0^\infty dx\, x\, \tilde\pi_k^2(x)}{\int_0^\infty dx\, \tilde\pi_k^2(x)}, \qquad \beta_k := \frac{\int_0^\infty dx\, \tilde\pi_k^2(x)}{\int_0^\infty dx\, \tilde\pi_{k-1}^2(x)}, \qquad (10) $$
with $\pi_{-1}(x) = 0$, $\pi_0(x) = 1$ and $\beta_0 = 1$, which we implemented using an appropriate
discretization of the x-axis. Proper normalization of the polynomials can be performed afterwards.
The Fisher information coefficients were then obtained with the help of equation (3), using a
precise five-point finite difference method for the derivatives of $Z$, $A$ and $\omega^2$ with
respect to $\sigma^2$ that are needed to obtain the score function.
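The recursion and the projection of the score function can be sketched on a grid as follows. This is a hedged illustration, not the implementation used for the figures: it assumes a perfectly lognormal density as a stand-in for the fitted convergence model, a uniform grid in $\ln x$ as the discretization, and a five-point finite difference applied directly to $\ln p$. The names `lognormal_log_pdf` and `cumulative_efficiency` are ours.

```python
import numpy as np

def lognormal_log_pdf(x, var):
    """ln p(x) for a lognormal variable with unit mean and variance `var`."""
    w2 = np.log1p(var)
    return (-(np.log(x) + 0.5 * w2) ** 2 / (2.0 * w2)
            - np.log(x) - 0.5 * np.log(2.0 * np.pi * w2))

def cumulative_efficiency(log_pdf, var, n_max, u_max=12.0, n_grid=20001, h=1e-4):
    """Fraction of the Fisher information on `var` in the moments 1 .. n_max."""
    u = np.linspace(-u_max, u_max, n_grid)   # grid in u = ln x
    x = np.exp(u)
    w = x * (u[1] - u[0]) * np.exp(log_pdf(x, var))   # p(x) dx on the grid
    # Five-point finite difference of ln p with respect to the variance.
    score = (log_pdf(x, var - 2 * h) - 8 * log_pdf(x, var - h)
             + 8 * log_pdf(x, var + h) - log_pdf(x, var + 2 * h)) / (12 * h)
    fisher_total = np.sum(w * score**2)
    # Monic polynomials pi_k from the three-term recurrence.
    pi_prev, pi_cur = np.zeros_like(x), np.ones_like(x)
    norm_prev = 1.0
    s2 = []
    for k in range(n_max + 1):
        norm = np.sum(w * pi_cur**2)
        if k >= 1:
            # Squared coefficient s_k^2, with pi_k normalized afterwards.
            s2.append(np.sum(w * score * pi_cur) ** 2 / norm)
        alpha = np.sum(w * x * pi_cur**2) / norm
        beta = norm / norm_prev if k >= 1 else 1.0
        pi_prev, pi_cur = pi_cur, (x - alpha) * pi_cur - beta * pi_prev
        norm_prev = norm
    return np.cumsum(s2) / fisher_total

eps = cumulative_efficiency(lognormal_log_pdf, var=1.0, n_max=4)
print(eps)  # eps[0] is the efficiency for N = 1, eps[1] for N = 2, ...
```

The grid has to extend far enough into the tail for the highest moment involved; with the default range this sketch is only intended for small `n_max` and variances up to order unity.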

In figure 2 we show the cumulative efficiency $\epsilon_N(\sigma^2)$, for N = 2 to N = 5, from
bottom to top. (Note that $s_1(\sigma^2)$ vanishes, since the mean of x is unity for any value of
the variance.) The uppermost line therefore contains the variance, the skewness, the kurtosis as
well as the 5th moment of the field. The contribution of each successive moment can be read off
from the difference between the corresponding successive curves. For higher N, $\epsilon_N$
converges quickly; the solid line in figure 3 shows $\epsilon_{10}$. For small values of the
variance, the field is still close to Gaussian, so that the Fisher information is close to being
entirely within the 2nd moment, and accordingly the ratio $\epsilon$ is close to unity in this
regime. It is obvious from these figures that the main effect for larger values of the variance is
not that Fisher information is transferred to higher order moments, but rather the dramatic cutoff
as soon as the variance crosses 0.1. At redshift 1, this corresponds to the scale of $\approx 1'$
[9, figure 1], so still within scales probed by weak lensing. For $\sigma \approx 1$, the ratio is
close to 0.05, meaning that the moments, even taken all together, completely fail to capture the
information. Optimal constraints on any cosmological parameter entering $\sigma$ are thus, for this
value of the variance, a factor $1/\sqrt{0.05} \approx 4.5$ tighter than those achievable with the
help of the entire hierarchy.
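The quoted factor follows from the Cramér-Rao reading of the efficiency. As a worked step, with $\Delta_{\rm opt}$ and $\Delta_{\rm moments}$ introduced here only as labels for the best achievable error bar and the one obtained from the moment hierarchy:

```latex
\frac{\Delta_{\rm moments}}{\Delta_{\rm opt}}
  = \frac{1}{\sqrt{\epsilon_N}}
\quad\Longrightarrow\quad
\frac{1}{\sqrt{0.05}} \approx 4.47 .
```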

In figure 3 we compare these results to the exact analytical expressions given in [10] for the
lognormal distribution, shown as the dashed line. These are given by, accounting for the different
normalization,
$$ \ldots $$
with $q := 1/(1 + \sigma^2)$. The total Fisher information content is in this case
$(q/\ln q)^2/2 - q^2/(4 \ln q)$. There also, the information content of the moments saturates
quickly as N grows. It is striking that the incompleteness of the moment hierarchy occurs much
earlier in the convergence field than in the lognormal. This can be understood from the following
considerations. The main effect of the improved model [9] for the convergence is to reproduce
accurately the very sharp cutoff of the probability density function at low convergence values
[9, figures 3-6]. This cutoff is very sensitive to the variance of the field, more sensitive than
the cutoff of the lognormal. However, there the contribution $x^n$ to the moment $m_n$ is
suppressed by orders of magnitude. To make this point clearer, we show in figure 4 the Fisher
information density $p\,(\partial_{\sigma^2} \ln p)^2$ for the lognormal distribution (dashed) and
the model we used (solid), at the scale of $\sigma = 1$. It is obvious in both cases that a large
fraction of the information is contained in the underdense regions, describing the cutoff of the
distribution, but inaccessible to the moments of x. Since this is even more the case for the
convergence field, the efficiency is accordingly even worse.
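The Fisher information density can be tabulated numerically for the lognormal case, which also allows a check against the closed-form total quoted above. The sketch below covers the lognormal only (the fitted convergence model is not reproduced here). The grid in $\ln x$, the central finite difference and the names `fisher_density` and `underdense_fraction` are assumptions made for illustration.

```python
import numpy as np

def lognormal_log_pdf(x, var):
    """ln p(x) for a lognormal variable with unit mean and variance `var`."""
    w2 = np.log1p(var)
    return (-(np.log(x) + 0.5 * w2) ** 2 / (2.0 * w2)
            - np.log(x) - 0.5 * np.log(2.0 * np.pi * w2))

def fisher_density(var, u_max=12.0, n_grid=20001, h=1e-4):
    """Grid x, cell widths dx and the density p (d ln p / d var)^2 on it."""
    u = np.linspace(-u_max, u_max, n_grid)   # uniform grid in u = ln x
    x = np.exp(u)
    dx = x * (u[1] - u[0])
    # Central finite difference for the score with respect to the variance.
    score = (lognormal_log_pdf(x, var + h) - lognormal_log_pdf(x, var - h)) / (2.0 * h)
    density = np.exp(lognormal_log_pdf(x, var)) * score**2
    return x, dx, density

var = 1.0
x, dx, density = fisher_density(var)
total = np.sum(density * dx)

# Closed-form total Fisher information of the lognormal, for comparison.
q = 1.0 / (1.0 + var)
exact = (q / np.log(q)) ** 2 / 2.0 - q**2 / (4.0 * np.log(q))

# Share of the information density located in underdense regions, x < 1.
underdense_fraction = np.sum((density * dx)[x < 1.0]) / total
print(total, exact, underdense_fraction)
```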

Restoration of the information. Finally we investigate to what extent the moment hierarchy of
$\ln x$ contains more Fisher information than the hierarchy of x. Though our method is completely
independent, this can be seen as complementary to recent works looking at the statistics of the
field after local transforms, and at the statistical power of its power spectrum, initiated in [?],
even though in these works the fact that information actually completely escapes the hierarchy is
not appreciated. This is done with the very same method used above, by obtaining the polynomials
orthogonal to the distribution of $\ln x$, or equivalently decomposing the score function of x in
polynomials in $\ln x$ rather than in x. This is seen to perform very well, as shown by the dotted
lines in figure 3. From bottom to top are plotted $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$. Also
shown in the figure is $\epsilon_{10}$, but it cannot be distinguished from unity, meaning that
completeness of the hierarchy is restored. We see that over the full range at least 80% of the
information is back in the first two moments, and 95% in the first three.
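The effect of the logarithmic mapping is easiest to see in the idealized lognormal case, where $\ln x$ is exactly Gaussian and its first two moments therefore carry all of the information. The sketch below compares the two-moment efficiency before and after the mapping under that assumption. It is not the fitted convergence model, for which the text reports at least 80% rather than 100%. Function names are illustrative.

```python
import numpy as np

def moment_fisher(dm, cov):
    """Fisher information carried by moments with derivatives dm, covariance cov."""
    return float(dm @ np.linalg.solve(cov, dm))

def two_moment_efficiencies(var):
    """Two-moment efficiency for x and for ln x, for a unit-mean lognormal."""
    w2 = np.log1p(var)          # variance of ln x
    mu = -0.5 * w2              # mean of ln x, so that <x> = 1
    q = 1.0 / (1.0 + var)       # d w2 / d var
    total = (q / np.log(q)) ** 2 / 2.0 - q**2 / (4.0 * np.log(q))

    # Moments of x: m_n = (1 + var)^(n (n - 1) / 2).
    m = lambda n: (1.0 + var) ** (n * (n - 1) / 2.0)
    dm_x = np.array([0.0, 1.0])
    cov_x = np.array([[m(2) - 1.0, m(3) - m(2)],
                      [m(3) - m(2), m(4) - m(2) ** 2]])

    # Moments of u = ln x, Gaussian with mean mu and variance w2.
    dm_u = q * np.array([-0.5, 1.0 - mu])
    cov_u = np.array([[w2, 2.0 * mu * w2],
                      [2.0 * mu * w2, 4.0 * mu**2 * w2 + 2.0 * w2**2]])

    return moment_fisher(dm_x, cov_x) / total, moment_fisher(dm_u, cov_u) / total

eff_x, eff_log = two_moment_efficiencies(1.0)
print(eff_x, eff_log)  # efficiency of (x, x^2) versus (ln x, (ln x)^2)
```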

FIG. 3. Solid and dashed lines: the efficiency $\epsilon_{N=10}$ of the first 10 moments to capture
Fisher information, for the convergence field (solid) and the lognormal field (dashed). The curves
do not change anymore with increasing N. Dotted: $\epsilon_1$, $\epsilon_2$, $\epsilon_3$ and
$\epsilon_{10}$ for the logarithmic transform of the field, from bottom to top.

FIG. 4. The Fisher information density of the lognormal (dashed) and the convergence field (solid),
renormalized such that it integrates to unity. Clearly, the Fisher information is mostly contained
within the underdense regions. The moments are however sensitive to the tail.

Conclusions. We have studied the statistical power of the moment hierarchy of the convergence field
when leaving the linear regime. Notably, the hierarchy ceases to provide a complete description of
the statistics of the convergence, letting an increasingly large fraction of the Fisher information
escape the hierarchy, and thus making constraints on cosmological parameters achievable with
measurements of the hierarchy suboptimal by increasingly large factors. While our results are exact
only for the one-point distribution (or equivalently the full correlation function hierarchy of the
convergence field in the limit of vanishing correlations), the correlation function hierarchy will
also show a similar behavior, though the amplitude of the loss in information and constraining
power may vary in the details from parameter to parameter. This is because this defect, for any
number of variables, is due to the very slow decay rate at infinity of the field distribution,
which cannot be reproduced by the exponential of a polynomial in the relevant variables. Our
findings are consistent with previous analytical results on the lognormal distribution [10], and
numerical work from N-body simulations at the power spectrum level [?]. Making a tighter connection
to such simulation results with the methods presented here is the subject of future work. Of
course, the quest for information in the non-linear regime already has problems of its own, such as
shot noise issues or accurate modeling, that we did not consider here. Nonetheless, these results
clearly show that if the correlation function hierarchy is to play a substantial role in getting
constraints out of the mildly or non-linear regime, then an approach similar to a Gaussianizing
transform [?, 17], in this work a simple logarithmic mapping, can hardly be avoided, though the
details still need to be figured out. It is reassuring that this approach seems to work well to
first order, and that first steps in that direction have recently been taken in perturbation theory
[18] for the matter field. Our work also points toward low convergence regions as carrying large
amounts of information, though the importance of noise issues needs to be clarified in this regime.
Thus, many promising avenues remain to be explored to make the most of mildly and non-linear
scales.